Efficient sampling in approximate dynamic programming algorithms

نویسندگان

  • Cristiano Cervellera
  • Marco Muselli
چکیده

Dynamic Programming (DP) is known to be a standard optimization tool for solving Stochastic Optimal Control (SOC) problems, either over a finite or an infinite horizon of stages. Under very general assumptions, commonly employed numerical algorithms are based on approximations of the cost-to-go functions, by means of suitable parametric models built from a set of sampling points in the d-dimensional state space. Here the problem of sample complexity, i.e., how “fast” the number of points must grow with the input dimension in order to have an accurate estimate of the cost-to-go functions in typical DP approaches such as value iteration and policy iteration, is discussed. It is shown that a choice of the sampling based on low-discrepancy sequences, commonly used for efficient numerical integration, permits to achieve, under suitable hypotheses, an almost linear sample complexity, thus contributing to mitigate the curse of dimensionality of the approximate DP procedure.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Dynamic Programming Approach for Approximate Uniform Generation of Binary Matrices with Specified Margins

Consider the collection of all binary matrices having a specific sequence of row and column sums and consider sampling binary matrices uniformly from this collection. Practical algorithms for exact uniform sampling are not known, but there are practical algorithms for approximate uniform sampling. Here it is shown how dynamic programming and recent asymptotic enumeration results can be used to ...

متن کامل

Optimal Learning and Approximate Dynamic Programming

Approximate dynamic programming (ADP) has emerged as a powerful tool for tackling a diverse collection of stochastic optimization problems. Reflecting the wide diversity of problems, ADP (including research under names such as reinforcement learning, adaptive dynamic programming and neuro-dynamic programming) has become an umbrella for a wide range of algorithmic strategies. Most of these invol...

متن کامل

Efficient Algorithms for Just-In-Time Scheduling on a Batch Processing Machine

Just-in-time scheduling problem on a single batch processing machine is investigated in this research. Batch processing machines can process more than one job simultaneously and are widely used in semi-conductor industries. Due to the requirements of just-in-time strategy, minimization of total earliness and tardiness penalties is considered as the criterion. It is an acceptable criterion for b...

متن کامل

Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes

Approximate dynamic programming has been used successfully in a large variety of domains, but it relies on a small set of provided approximation features to calculate solutions reliably. Large and rich sets of features can cause existing algorithms to overfit because of a limited number of samples. We address this shortcoming using L1 regularization in approximate linear programming. Because th...

متن کامل

Approximate String Matching with Ordered q-Grams

Approximate string matching with k differences is considered. Filtration of the text is a widely adopted technique to reduce the text area processed by dynamic programming. We present sublinear filtration algorithms based on the locations of q-grams in the pattern. Samples of q-grams are drawn from the text at fixed periods, and only if consecutive samples appear in the pattern approximately in...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Comp. Opt. and Appl.

دوره 38  شماره 

صفحات  -

تاریخ انتشار 2007